Abstract. We present an application focused on the design of resilient long-reach
passive optical networks. We specifically consider dual-parented networks whereby
each customer must be connected to two metro sites via local exchange sites. An
important property of such a placement is resilience to single metro node failure.
The objective of the application is to determine the optimal position of a set of
metro nodes such that the total optical fibre length is minimized. We prove that this
problem is NP-Complete. We present two alternative combinatorial optimisation
approaches to finding an optimal metro node placement using: a mixed integer linear
programming (MIP) formulation of the problem; and, a hybrid approach that uses
clustering as a preprocessing step. We consider a detailed case-study based on a
network for Ireland. The hybrid approach scales well and finds solutions that are
close to optimal, with a runtime that is two orders-of-magnitude better than the MIP
model.



1 Introduction 

Over the past decade telecommunications network traffic has grown exponentially at an
average annual rate above 75%, prompted by a multitude of new on-line content sharing
applications such as Facebook and YouTube. Although the forecast for traffic growth
over the next 5 years is reduced, it still suggests an average annual compound rate of
about 37%, with Internet video applications growing at about 47%. As High Definition
(HD) and 3D video will increasingly be delivered over the Internet, such forecasts
do not seem to over-estimate the traffic scenario. Additionally, delivering high peak
data rates becomes increasingly important for delivering satisfactory quality of
experience, especially for real-time services. Fiber-To-The-Premises (FTTP), and in
particular Fiber-To-The-Home (FTTH), seems to be the only solution capable of providing
scalable access bandwidth for the foreseeable future.

Passive Optical Networks (PONs) are widely recognized as an economically viable
solution to deploy FTTP and FTTH, by virtue of the ability to share costly equipment
and fibre among a number of customers. In particular, the Long-Reach PON (LR-PON)
is gaining interest. LR-PON provides an economically viable solution as the number of
active network nodes can be reduced by two orders-of-magnitude and all electronic data
processing can be removed from the local exchange sites, thereby reducing both cost
and energy consumption [6]. However, a major fault occurrence like a complete failure
of a single metro node that terminates the LR-PON could affect tens of thousands of
customers. Therefore, protection against a metro node failure is of primary importance
for the LR-PON-based architecture.

A basic and effective protection mechanism for LR-PON is to dual parent each
system onto two metro/outer core nodes 117 121. This is similar to a simple protection
solution for IP routers known as double or redundant protection [5]. Figure 1 shows an
example of a PON network, together with its Wavelength-Division Multiplexing (WDM)
backbone interconnections. Each PON is dual-parented, with the dashed lines
representing the protection links. In this work we have considered protection links up to
the first PON split (or local exchange site), leaving the "last mile" unprotected. This is a
common choice for residential customers, while protection can be extended to the user
premises for business customers. For example, considering Figure 1, if metro node 1
fails, its PONs will be protected by metro node 2. Of course, node 2 needs to be
over-provisioned with much larger IP capacity in order to protect the additional load [8].
Providing such a protection mechanism can significantly increase the overall network
cost because fibre deployment is a significant contributor to the total cost of the PON
installation. Therefore, we focus on the problem of finding an optimal set of positions of
k metro nodes such that the cost of connecting optical fibres between metro nodes and
exchange sites is minimized. The set of possible positions available for a metro node is
the set of positions associated with the existing old local exchange sites.



2 Problem Formalization 



We formally describe the problem of LR-PON deployment for a real geography based
on the data provided by the Irish incumbent operator. More precisely, we present the
definition and the complexity of the so-called Single Coverage Problem, where each
exchange site is only connected to a single metro node, and then present the definition
and complexity of the Double Coverage Problem, where each exchange site is connected
to two metro nodes.

Definition 1 (Single Coverage Problem). An instance of the Single Coverage Problem
(SCP) is defined by $(A, B, c, k, \phi)$, where $(A, B)$ is a complete bipartite graph
with cost function $c$ such that $c_{ij}$ is the cost of allocating node $i \in A$ to
node $j \in B$, $k$ is an integer value such that $k \le |B|$ and $\phi$ is some real
value. An allocation from $A$ to $M \subseteq B$ maps each node $i$ of $A$ to the
cheapest node $j$ of $M$ such that $j = \arg\min_{j \in M} c_{ij}$ and $k = |M|$.
The total cost of the allocation is the sum of the allocation cost of each node
of $A$. The problem is to verify whether there exists a subset of $k$ nodes of $B$ such
that the total cost is less than or equal to $\phi$.

Proposition 1. The Single Coverage Problem is NP-Complete. 

Proof. A reduction from the Hitting Set problem, which is known to be NP-complete
IH, is obtained as follows: given a collection $C$ of subsets of a finite set $S$ and a
positive integer $m \le |S|$, the Hitting Set problem is to decide whether there is a
subset $S' \subseteq S$ with $|S'| \le m$ such that $S'$ contains at least one element
from each subset in $C$. The reduction to SCP, $(A, B, c, k, \phi)$, goes as follows. We
have a node in $A$ for each set $S_i$ in $C$ and a node in $B$ for each $j \in S$. The
cost of each edge $c_{ij}$ is either 0 if $j$ is in $S_i$ or 1 otherwise. We set
$\phi = 0$ and $k = m$. The constructed instance of SCP has a solution of cost 0 if and
only if there exists a hitting set of size $m$ for $C$. ■
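The reduction in the proof of Proposition 1 can be sketched on a toy instance. The following is only an illustration, assuming the 0/1 cost reading of the proof; the function names `hitting_set_to_scp` and `scp_has_solution` are hypothetical, and the brute-force check is usable only for very small inputs.

```python
from itertools import combinations

def hitting_set_to_scp(subsets, universe, m):
    """Build the SCP instance (cost matrix c, k, phi) described in the proof."""
    elems = sorted(universe)
    # One row per subset S_i, one column per element j:
    # cost 0 if j is in S_i, cost 1 otherwise.
    c = [[0 if e in s else 1 for e in elems] for s in subsets]
    return c, m, 0  # k = m and phi = 0

def scp_has_solution(c, k, phi):
    """Brute force: is there a set M of k columns with total cost <= phi?"""
    for M in combinations(range(len(c[0])), k):
        total = sum(min(row[j] for j in M) for row in c)
        if total <= phi:
            return True
    return False

subsets = [{1, 2}, {2, 3}, {4}]
universe = {1, 2, 3, 4}

# With m = 2 the set {2, 4} hits every subset, so a zero-cost allocation exists.
c2, k2, phi2 = hitting_set_to_scp(subsets, universe, m=2)
print(scp_has_solution(c2, k2, phi2))

# With m = 1 no single element hits all three subsets.
c1, k1, phi1 = hitting_set_to_scp(subsets, universe, m=1)
print(scp_has_solution(c1, k1, phi1))
```

A zero-cost allocation exists exactly when every row has a chosen column of cost 0, which is the hitting-set condition.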

Definition 2 (Double Coverage Problem). An instance of the Double Coverage Problem
(DCP) is also defined by $(A, B, c, k, \phi)$, where $(A, B)$ is a complete bipartite
graph with cost function $c$, where $c_{ij}$ is the cost of allocating node $i \in A$
to node $j \in B$, $k$ is an integer value such that $k \le |B|$ and $\phi$ is some real
value. An allocation from $A$ to $M \subseteq B$ maps each node $i$ of $A$ to the
cheapest node $j_1$ of $M$ such that $j_1 = \arg\min_{j_1 \in M} c_{ij_1}$, and to the
second cheapest node $j_2$ of $M$ such that
$j_2 = \arg\min_{j_2 \in M \mid j_2 \neq j_1} c_{ij_2}$. The total cost of the allocation
is the sum of the allocation costs of each node of $A$ to two nodes of $B$. The problem
is to verify whether there exists a subset of $k$ nodes of $B$ such that the total cost
is less than or equal to $\phi$.
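To make Definition 2 concrete, the sketch below evaluates the double-coverage cost of a candidate set M and enumerates all subsets of size k. It is a minimal illustration rather than the method used in the paper: the names `dcp_cost` and `solve_dcp_bruteforce` and the 3 x 3 cost matrix are made up for the example, and exhaustive enumeration is only feasible for tiny instances.

```python
from itertools import combinations

def dcp_cost(c, M):
    """Total cost when each row i is allocated to its two cheapest nodes in M."""
    total = 0
    for row in c:
        two_cheapest = sorted(row[j] for j in M)[:2]
        total += sum(two_cheapest)
    return total

def solve_dcp_bruteforce(c, k):
    """Try every k-subset M of the columns and keep the cheapest one (k >= 2)."""
    cols = range(len(c[0]))
    return min((dcp_cost(c, M), M) for M in combinations(cols, k))

# Hypothetical costs c[i][j] of allocating node i of A to node j of B.
c = [
    [1, 4, 6],
    [5, 2, 3],
    [7, 3, 1],
]
best_cost, best_M = solve_dcp_bruteforce(c, k=2)
print(best_cost, best_M)
```

The decision version in the definition then amounts to checking whether `best_cost` is at most the threshold.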

Proposition 2. The Double Coverage Problem is NP-Complete. 

Proof. We can reduce SCP, which was proved to be NP-complete, to DCP by adding
one extra node to $B$ and setting the cost function accordingly. More precisely, let
$B' = B \cup \{s\}$. Let $c'$ be the cost function such that $c'_{ij} = c_{ij}$ if
$i \in A$ and $j \in B$, otherwise $c'_{ij} = \beta$ such that
$\beta \le \min_{i \in A, j \in B} c_{ij}$. Solving the SCP instance $(A, B, c, k, \phi)$
is equivalent to solving the DCP instance $(A, B', c', k + 1, \phi + |A| \times \beta)$.
Notice that any solution of the SCP instance can be transformed into a DCP solution by
setting $s$ as the cheapest node for every node in $A$ and making the SCP allocation
equivalent to the allocation of the second cheapest node in the DCP instance. Similarly,
any solution of the DCP instance can be transformed into a SCP solution by ignoring the
cheapest node, since the cost associated with the allocation of the cheapest nodes is
bound to be equal to or greater than $|A| \times \beta$, making the allocation of the
second cheapest node a valid solution of the SCP instance. ■
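The construction in the proof of Proposition 2 can be checked numerically on a small example. This is a sketch under the assumption that the extra node s is modelled as an additional column with cost beta for every node of A; the helper names `opt_cost` and `scp_to_dcp` and the cost values are illustrative only.

```python
from itertools import combinations

def opt_cost(c, k, per_row):
    """Minimum, over all k-subsets M of columns, of the sum of the `per_row`
    cheapest costs of each row within M (per_row=1: SCP, per_row=2: DCP)."""
    cols = range(len(c[0]))
    return min(
        sum(sum(sorted(row[j] for j in M)[:per_row]) for row in c)
        for M in combinations(cols, k)
    )

def scp_to_dcp(c, beta):
    """Append the extra node s as a new column whose cost is beta for every row."""
    assert beta <= min(min(row) for row in c)
    return [row + [beta] for row in c]

# Hypothetical SCP instance with |A| = 4, |B| = 3 and k = 2.
c = [[4, 9, 7], [8, 3, 6], [5, 7, 2], [6, 6, 9]]
k, beta = 2, 1

scp_opt = opt_cost(c, k, per_row=1)
dcp_opt = opt_cost(scp_to_dcp(c, beta), k + 1, per_row=2)
print(scp_opt, dcp_opt, dcp_opt - len(c) * beta)
```

On this instance the optimal DCP cost exceeds the optimal SCP cost by exactly |A| x beta, which is the shift applied to the threshold in the proof.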



In this paper we focus on the double coverage problem where both $A$ and $B$ are
sets of exchange sites. Let $E$ be a set of exchange sites whose locations are fixed. In
Figure 2 all the points are locations of exchange sites in Ireland.³ Let $l_i$ be the
load of the exchange site $i \in E$, which is equivalent to the number of customers that
are connected to the exchange site $i$.

Let $k$ be the number of metro nodes that are required to be placed in Ireland. A metro
node can be placed at any position where an exchange site is located. Thus, the set of
positions available for each metro node is the set of positions of all the exchange
sites. Let $d$ be a matrix where $d_{ij}$ denotes the Euclidean distance between the
positions of exchange sites $i$ and $j$. In order to account for the fact that the amount
of fibre needed to connect two network points is usually larger than their Euclidean
distance, because fibre paths generally follow the layout of the road network, a routing
factor of 1.6 is applied. Let $c_{ij}$ be the cost of connecting exchange site $i$ to a
metro node placed at the location of an exchange site $j \in E$, which is computed as
follows:

    $c_{ij} = 1.6 \times d_{ij} \times \alpha_i \times l_i.$

This cost model is based on the work of one of the authors while working at BT [6].
Here $\alpha_i$ is constant and its value is dependent on the load of the exchange site
$i$. The value of $\alpha_i$ decreases as the load increases since sharing of the fibre
increases. The aim is to determine the positions of $k$ metro nodes such that each
exchange site is connected to two metro nodes and the sum of the costs of the connections
between exchange sites and their respective metro nodes is minimized.
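As a small worked example of the cost model, the sketch below reads the connection cost as the product of the routing factor 1.6, the Euclidean distance, the load-dependent constant and the load of site i (the same weight, constant times load, that is used for clustering in Section 4). The coordinates, loads and constants are invented for illustration; how the constant is derived from the load is not specified here, so it is passed in as data.

```python
import math

ROUTING_FACTOR = 1.6  # fibre follows roads, so it is longer than the straight line

def connection_cost(site_i, site_j, alpha_i, load_i):
    """Cost of connecting exchange site i to a metro node located at site j."""
    d_ij = math.dist(site_i, site_j)  # Euclidean distance between the two sites
    return ROUTING_FACTOR * d_ij * alpha_i * load_i

def cost_matrix(sites, alphas, loads):
    """Full matrix c[i][j] for every pair of exchange sites."""
    return [
        [connection_cost(sites[i], sites[j], alphas[i], loads[i])
         for j in range(len(sites))]
        for i in range(len(sites))
    ]

# Hypothetical data: three sites with coordinates, loads and alpha constants.
sites = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
loads = [100, 400, 50]
alphas = [0.5, 0.25, 0.8]

c = cost_matrix(sites, alphas, loads)
print(c[0][1])  # 1.6 * 5 * 0.5 * 100
```

Note that the matrix is not symmetric in general, because the weight depends on the load of the exchange site being connected and not on the metro location.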

3 MIP Model 

The objective is to place a number of metro nodes such that the cost of the connection
between the local exchanges and their corresponding metro nodes is minimized.
The closest metro node of an exchange site is called the primary metro node while the
second closest is called the secondary metro node.

Constants. Let $E$ be a set of exchange sites whose locations are fixed. Let $k$ be the
number of metro nodes whose positions are to be determined. Let $c_{ij}$ be the cost of
connecting an exchange site $i$ to a metro node placed at the location of an exchange
site $j \in E$.

Variables. $\forall (i, j) \in E \times E$, $x_{ij} \in \{0, 1\}$ denotes whether exchange
site $i$ is connected to a metro node $j$. $\forall j \in E$, $y_j \in \{0, 1\}$ denotes
whether $j$ is used as a metro node.

3 Notice that some points are outside the boundary of Ireland. This is because of the projection 
of the map of Ireland we are using in this figure. 




Fig. 2. Exchange Sites in Ireland 



Constraints. Each exchange site $i \in E$ should be connected to two metro nodes:

    $\forall i \in E : \sum_{j \in E} x_{ij} = 2.$    (1)

Since each $x_{ij}$ is binary, Constraint (1) implicitly enforces that the primary and
secondary metro nodes of exchange site $i$ should be different. For each exchange site
$i \in E$ its primary and secondary metro nodes can be inferred based on the costs of
connecting $i$ to the metro nodes respectively. If the metro node connected to an
exchange site $i \in E$ is placed at the location of exchange site $j$ then $y_j$ is one:

    $\forall i, j \in E : y_j \ge x_{ij}.$    (2)

The number of used locations for metro nodes should be equal to $k$:

    $\sum_{j \in E} y_j = k.$    (3)

The number of constraints of type (2) is $O(n^2)$, which can grow quickly for large
values of $n = |E|$, in which case they can be replaced by the following weaker
constraint:

    $\forall j \in E : |E| \times y_j \ge \sum_{i \in E} x_{ij}.$

Objective. The objective is to minimize the cost of the connection between local
exchanges and their corresponding metro nodes, i.e.,

    $\min \sum_{i \in E} \sum_{j \in E} c_{ij} \times x_{ij}.$
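The model can be read as follows: given a choice of metro locations, the variables and the objective value are fully determined. The sketch below does not call a MIP solver; it only builds x and y from a hypothetical set of metro sites and checks the constraints of this section (two parents per site, linking of x to y, exactly k metro nodes, and the weaker aggregated linking constraint). All names and the 4 x 4 cost matrix are illustrative assumptions.

```python
def assignment_from_metros(c, metros):
    """Set y_j = 1 for chosen sites and connect each site to its two cheapest metros."""
    n = len(c)
    y = [1 if j in metros else 0 for j in range(n)]
    x = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in sorted(metros, key=lambda j: c[i][j])[:2]:
            x[i][j] = 1
    return x, y

def is_feasible(x, y, k):
    n = len(y)
    two_parents = all(sum(x[i]) == 2 for i in range(n))                # each site has two metros
    linked = all(y[j] >= x[i][j] for i in range(n) for j in range(n))  # x_ij = 1 implies y_j = 1
    k_metros = sum(y) == k                                             # exactly k metro nodes
    return two_parents and linked and k_metros

def satisfies_aggregated_link(x, y):
    """Weaker linking constraint: |E| * y_j >= sum_i x_ij for every j."""
    n = len(y)
    return all(n * y[j] >= sum(x[i][j] for i in range(n)) for j in range(n))

def objective(c, x):
    n = len(c)
    return sum(c[i][j] * x[i][j] for i in range(n) for j in range(n))

# Hypothetical symmetric costs for four exchange sites.
c = [[0, 2, 5, 9], [2, 0, 3, 7], [5, 3, 0, 4], [9, 7, 4, 0]]
x, y = assignment_from_metros(c, metros={0, 2})
print(is_feasible(x, y, k=2), satisfies_aggregated_link(x, y), objective(c, x))
```

A MIP solver searches over all such assignments at once; this check is only a way to validate a returned solution or to reason about the constraints by hand.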



4 Cluster-Based Sampling 

For the MIP model, as presented in the previous section, the set of positions of all 
exchange sites is considered as the domain of the metro node position for each exchange 
site. This may prohibit us from solving the problem optimally as the size of the set of 
the positions of the exchange sites increases. In order to overcome this scalability issue 
both in terms of time and space, we propose a heuristic approach as a preprocessing 
step for selecting a small subset of metro node positions for each exchange site and 
then use the MIP model to solve the problem optimally. 

One simple approach to overcome this could be to limit the number of metro node
positions of each exchange site based on their distances from the exchange site. More
precisely, select the $k$ closest/cheapest metro node positions for each exchange site
$e$ (here $k$ denotes the size of this neighbourhood, not the number of metro nodes). This
heuristic approach is called k-cheapest neighbours (KCN). One of the drawbacks of
this approach is that the resulting problem can be inconsistent, especially when $k$ is
small. Therefore, it is important to find a value of $k$ such that the problem is
satisfiable. Another issue is that, depending on the value of $k$, an optimal solution of
the resulting problem may not be of good quality despite the problem being satisfiable.
Obviously, when $k = |E|$ we will always find the best solution but at the expense of
more time. There is a trade-off between the value of $k$ and the time required to find a
good solution.

Algorithm 1 computeOverlappingClusters($E$, $k$)

    $cost \leftarrow \infty$
    select $k$ points randomly from $E$ and assign them to $m_1, m_2, \ldots, m_k$
    $loop \leftarrow$ True
    While $loop$ do
        $\forall i \le k : P_i \leftarrow \{x_j \mid \forall i^* \le k : \mathrm{dist}(x_j, m_i) \le \mathrm{dist}(x_j, m_{i^*})\}$
        $\forall i \le k : S_i \leftarrow \{x_j \mid \exists i^* \le k : \mathrm{dist}(x_j, m_{i^*}) \le \mathrm{dist}(x_j, m_i) \wedge \forall l \neq i^* : \mathrm{dist}(x_j, m_i) \le \mathrm{dist}(x_j, m_l)\}$
        $newcost \leftarrow \sum_{1 \le i \le k} \sum_{x_j \in P_i \cup S_i} w[x_j] \times \mathrm{dist}(x_j, m_i)$
        If $cost > newcost$
            $cost \leftarrow newcost$
            $P^* \leftarrow P$, $S^* \leftarrow S$
            For $i = 1, \ldots, k$ do
                update $m_i.X$ and $m_i.Y$ to the new mean of cluster $P_i \cup S_i$
        Else
            $loop \leftarrow$ False
    Return $(P^*, S^*)$

We propose a new approach for computing a sample of positions where a metro
node can be placed at a given exchange site. This heuristic approach is called
cluster-based sampling (CBS). The pseudo-code is depicted in Algorithms 1 and 2. The
general idea is to apply a variant of the k-means algorithm [4] for computing $k$
clusters of exchange sites. Whenever a local minimum is reached within the algorithm, a
best exchange site position, based on some criterion, is selected from each cluster as a
possible location of the metro node for all the exchange sites within that cluster. A
sample of positions for each exchange site is computed by repeating this process a given
number of times. The cardinality of this set is considerably smaller than the full set of
positions of exchange sites. Since each exchange site should be connected to two metro
nodes, the algorithm for weighted k-means clustering is adapted to ensure that each
exchange site is in exactly two clusters.

Algorithm 2 SamplingPoints($nbruns$, $E$, $k$)

    $crun \leftarrow 0$
    $\forall x_j \in E$, $Pos(x_j) \leftarrow \emptyset$
    While $crun < nbruns$ do
        $crun \leftarrow crun + 1$
        $(P, S) \leftarrow$ computeOverlappingClusters($E$, $k$)
        For $i = 1, \ldots, k$ do
            If $P_i \neq \emptyset$ then
                select $s \in P_i$ such that
                $\forall s' \in P_i : \sum_{x_j \in P_i \cup S_i} w[x_j] \times \mathrm{dist}(s, x_j) \le \sum_{x_j \in P_i \cup S_i} w[x_j] \times \mathrm{dist}(s', x_j)$
            Else if $S_i \neq \emptyset$
                select $s \in S_i$ such that
                $\forall s' \in S_i : \sum_{x_j \in S_i} w[x_j] \times \mathrm{dist}(s, x_j) \le \sum_{x_j \in S_i} w[x_j] \times \mathrm{dist}(s', x_j)$
            $\forall x_j \in P_i \cup S_i : Pos(x_j) \leftarrow Pos(x_j) \cup \{s\}$
    Return $Pos$

Fig. 3. Five overlapping clusters.

Algorithm 1 computes $k$ overlapping clusters. An example is presented in Figure 3,
where the value of $k$ is 5. Notice that each point is present in two clusters. The
algorithm computeOverlappingClusters starts by selecting $k$ points,
$(m_1, \ldots, m_k)$, randomly from a given set of sites $E$. These points represent the
initial $k$ means of the overlapping clusters. Each $m_i$ is associated with two
attributes: $m_i.X$ denotes the X dimension and $m_i.Y$ denotes the Y dimension.
Initially the cost is set to infinity. Each exchange site is assigned to two clusters:
the one associated with the closest mean and another with the second closest mean. In the
algorithm a cluster $i$ is represented by $P_i \cup S_i$ such that $m_i$ is the closest
mean for each $p \in P_i$ and $m_i$ is the second closest mean for each $p \in S_i$. We
use $\mathrm{dist}(p_i, p_j)$ to denote the Euclidean distance between the points $p_i$
and $p_j$ and $w[p_i]$ to denote the weight associated with a site $p_i$, which is
equivalent to $\alpha_i \times l_i$ for our problem. The cost is evaluated by summing the
weighted distances of all the points of the clusters with respect to their corresponding
means. If the new cost is less than the current cost then the new means are calculated
for all clusters. While the new cost is better than the previous cost, the assignment of
the exchange sites to two clusters and the update of the means is repeated. The algorithm
returns the tuple $(P^*, S^*)$. The complexity of each iteration within the while loop of
Algorithm 1 is $O(nk)$, where $n$ is the number of sites and $k$ is the required number
of metro nodes (or the number of clusters).
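The description of computeOverlappingClusters above can be turned into a short runnable sketch. The paper's implementation was in Java; the Python below is only an illustration under stated assumptions: the mean of a cluster is updated as the weighted centroid of its members (the exact update formula is not given here), ties between equally distant means are broken by index, and the `rng` parameter is added for reproducibility.

```python
import math
import random

def compute_overlapping_clusters(sites, weights, k, rng=random):
    """Weighted k-means variant in which every site joins its two nearest clusters.
    Returns (P, S): P[i] / S[i] list the sites whose closest / second closest
    mean is the mean of cluster i. Requires k >= 2."""
    means = rng.sample(sites, k)             # initial means: k random sites
    cost = math.inf
    best = None
    while True:
        P = [[] for _ in range(k)]
        S = [[] for _ in range(k)]
        new_cost = 0.0
        for idx, site in enumerate(sites):
            order = sorted(range(k), key=lambda i: math.dist(site, means[i]))
            first, second = order[0], order[1]
            P[first].append(idx)
            S[second].append(idx)
            new_cost += weights[idx] * (math.dist(site, means[first])
                                        + math.dist(site, means[second]))
        if new_cost >= cost:
            return best                      # no improvement: keep the last improving clustering
        cost, best = new_cost, (P, S)
        for i in range(k):                   # assumed update: weighted centroid of P_i and S_i
            members = P[i] + S[i]
            total = sum(weights[j] for j in members)
            if total > 0:
                means[i] = (sum(weights[j] * sites[j][0] for j in members) / total,
                            sum(weights[j] * sites[j][1] for j in members) / total)

# Hypothetical sites and unit weights.
sites = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (20, 0), (21, 1)]
weights = [1.0] * len(sites)
P, S = compute_overlapping_clusters(sites, weights, k=3, rng=random.Random(0))
print(P, S)
```

Each iteration sorts k distances per site, so for a fixed small k the work per iteration grows linearly with the number of sites.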

The inputs of SamplingPoints (Algorithm 2) are $nbruns$, $k$ and $E$. Here $nbruns$
denotes the number of times the overlapping clusters should be computed, $k$ denotes the
number of clusters and $E$ denotes the set of exchange sites. $Pos(x_j)$ denotes a set of
metro node positions of an exchange site $x_j$. Initially, $Pos(x_j)$ is an empty set for
each exchange site $x_j$. First, computeOverlappingClusters is invoked, which returns
a set of overlapping clusters such that each exchange site is present in exactly two
clusters. Recall that a cluster (of exchange sites) $i$ is denoted by $P_i \cup S_i$.
After that an element $s$ is selected from each cluster as a possible metro node position
for all the exchange sites within $P_i \cup S_i$. Also recall that $P_i \cup S_i$ means
that the selected metro node $s$ is the cheapest/closest for each $e \in P_i$ and it is
second cheapest for each $e \in S_i$. Therefore, if $P_i \neq \emptyset$ then $s$ is
selected from $P_i$ such that the sum of the weighted distances between $s$ and all the
exchange sites of the cluster $i$ is minimum. Otherwise it is selected from $S_i$. This
entire procedure is repeated $nbruns$ times. Algorithm 1 can be seen as a variant of the
weighted k-means clustering algorithm. The main difference is that in the original
algorithm clusters are pairwise mutually exclusive, but Algorithm 1 computes overlapping
clusters as required by the problem.
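The sampling step can be sketched in the same spirit. In this illustration the clustering routine is passed in as a parameter (`cluster_fn`, a hypothetical name) so that the block stays self-contained; the example uses a fixed hand-made clustering as a stand-in for computeOverlappingClusters, which is an assumption made purely for demonstration.

```python
import math

def sampling_points(nbruns, sites, weights, k, cluster_fn):
    """Collect, for every site, the metro positions proposed over nbruns clusterings."""
    pos = {j: set() for j in range(len(sites))}
    for _ in range(nbruns):
        P, S = cluster_fn(sites, weights, k)
        for i in range(k):
            members = P[i] + S[i]
            candidates = P[i] if P[i] else S[i]   # prefer sites whose closest mean is i
            if not candidates:
                continue                          # empty cluster: nothing to propose

            def weighted_distance(s):
                return sum(weights[j] * math.dist(sites[s], sites[j]) for j in members)

            s = min(candidates, key=weighted_distance)
            for j in members:
                pos[j].add(s)
    return pos

def fixed_clusters(sites, weights, k):
    # Hypothetical stand-in for computeOverlappingClusters: two hand-made clusters
    # in which every site appears once in some P_i and once in some S_i.
    P = [[0, 1, 2], [3, 4]]
    S = [[3, 4], [0, 1, 2]]
    return P, S

sites = [(0, 0), (2, 0), (1, 0), (10, 0), (12, 0)]
weights = [1.0] * len(sites)
pos = sampling_points(1, sites, weights, 2, fixed_clusters)
print(pos)
```

With a randomised clustering routine, repeated runs generally propose different sites, so the candidate sets grow with `nbruns` while staying far smaller than the full set of exchange sites.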



Table 1. Results for 19 metro nodes.

|E|     optimal        CBS (GAP)    MIP time (s)    CBS time (s)
100     470,439,821    0%                   0.25            0.71
200     475,779,040    0%                   2.29            1.59
300     476,876,335    0.000%              36.90            3.15
400     477,736,761    0.009%              52.76            4.40
500     476,930,454    0.014%              96.89            6.59
600     476,860,839    0.013%             168.47            8.49
700     477,825,864    0.012%           1,277.27           14.68
800     477,432,981    0.033%             498.29           17.43
900     477,608,042    0.019%             817.24           20.09
1000    477,730,261    0.029%           1,081.61           32.78
1100    477,789,473    0.038%           1,716.27           39.32

Table 2. Results for 20 metro nodes.

|E|     optimal        CBS (GAP)    MIP time (s)    CBS time (s)
100     456,703,030    0%                   0.27            0.69
200     462,745,384    0%                   2.26            1.68
300     463,322,390    0.001%              14.84            2.88
400     464,669,197    0%                  70.46            4.57
500     464,018,395    0.001%             115.40            6.78
600     464,181,132    0.034%             226.34           11.00
700     464,696,666    0.006%             405.57           11.49
800     464,576,759    0.034%             661.01           19.25
900     464,918,687    0.039%           1,108.49           27.26
1000    464,968,787    0.028%           1,587.39           31.74
1100    465,066,168    0.034%           8,777.75           54.53



5 Empirical Results 

In this section we investigate different approaches for solving the problem of
determining locations of metro nodes in Ireland.

We used CPLEX for solving all the integer linear programming formulations of the
instances of the double coverage problem. All of our algorithms were implemented in
Java. In our experiments, we varied the number of metro nodes between 18 and 24 for
Ireland. The results are reported for 19, 20, 23 and 24 metro nodes. The original problem
had 1100 exchange sites. In order to do systematic experimentation, we generated 10
instances of smaller sizes. These instances are representative of the original instance
since they were generated by applying the k-means algorithm on the original instance by
varying k (or the number of required exchange sites) from 100 to 1000 in steps of 100.
All the experiments were run on Linux 2.6.25 x64 on a Dual Quad Core Xeon CPU
with overall 11.76 GB of RAM and processor speed of 2.66GHz.



Table 3. Results for 23 metro nodes.

|E|     optimal        CBS (GAP)    MIP time (s)    CBS time (s)
100     421,504,120    0%                   0.24            0.72
200     429,159,208    0.048%               7.63            2.19
300     429,880,291    0%                  19.82            2.85
400     430,115,650    0.005%             161.45            5.41
500     430,043,176    0.001%             350.34            9.18
600     429,866,927    0.033%             713.30           10.52
700     430,802,977    0.019%           1,761.50           18.17
800     430,755,591    0.011%           2,631.64           21.79
900     430,737,706    0.024%           3,858.39           30.84
1000    430,918,149    0.008%           7,537.79           39.18
1100    430,839,593    0.026%           9,706.40           36.61

Table 4. Results for 24 metro nodes.

|E|     optimal        CBS (GAP)    MIP time (s)    CBS time (s)
100     411,560,864    0%                   0.34            0.71
200     419,088,008    0%                   9.04            2.15
300     420,069,722    0.005%              23.64            2.93
400     419,722,195    0%                  68.85            4.88
500     419,700,725    0%                 182.44            7.39
600     419,773,717    0.039%             293.45            9.48
700     420,102,946    0.008%             903.16           11.71
800     420,352,288    0.007%           1,532.55           16.61
900     420,235,833    0.011%           1,752.59           21.46
1000    420,317,577    0.009%           3,657.55           25.93
1100    420,347,707    0.025%           4,316.71           33.29



The results for MIP are presented in Tables 1-4. All the experiments for this
approach were run to completion. The optimal values computed using this approach are
shown under the column named "optimal". The results in terms of time (in seconds) are
also reported. In terms of time this was the most expensive approach, especially when
the number of exchange sites is more than 500.
Fig. 4. KCN approach for 20 metro nodes and 500 exchange sites: (a) Time required for
different values of k. (b) Optimal value for different values of k.



Although the KCN approach may solve a problem instance quicker than the MIP
approach, one issue is to determine the right value of k. A small k may result in making
the problem inconsistent and a large k may result in spending more time. Also, despite
having a satisfiable problem when k is set to a relatively low value, it can still result
in spending more time than that required for solving the original problem, when all the
positions are considered for all exchange sites. This is illustrated in Figure 4 by
plotting the results for solving an instance of the double coverage problem where the
number of exchange sites is 500 and the number of metro nodes is 20. For both Figures 4(a)
and 4(b) the x-axis denotes the value of k, which is varied from 2 to 500 in steps of 2.
The y-axis of Figure 4(a) is the time required to solve the instance and the y-axis of
Figure 4(b) is the optimal value corresponding to k. Notice that when k is less than or
equal to 46 the problem is always unsatisfiable. An interesting point to observe is that
when k is between 48 and 56 the time required to solve can be up to 2 orders-of-magnitude
more than that required when k is 500. Also notice that when k is set to 84 an optimal
solution is discovered and the time required to find an optimal solution is also the
least. The results of the KCN approach are not reported in Tables 1-4 for two reasons.
First, determining the right value of k is not always possible and additionally there is
an overhead. Second, the other hybrid approach CBS almost always outperforms KCN in
terms of time without degrading the quality of the solution.
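Why a small neighbourhood size makes the KCN-restricted problem unsatisfiable can be seen on a toy instance. The sketch below is an assumption-laden illustration, not the experimental code: `kcn_domains` keeps the cheapest candidate positions per site, and `kcn_satisfiable` brute-forces whether some set of metro locations gives every site two metros within its restricted domain. The parameter is called `kk` here to avoid confusion with the number of metro nodes.

```python
from itertools import combinations

def kcn_domains(c, kk):
    """For each site i keep only the kk cheapest candidate metro positions."""
    n = len(c)
    return [set(sorted(range(n), key=lambda j: c[i][j])[:kk]) for i in range(n)]

def kcn_satisfiable(c, k_metros, kk):
    """Is there a set of k_metros locations such that every site has at least
    two of them inside its restricted domain? (Brute force, tiny inputs only.)"""
    domains = kcn_domains(c, kk)
    n = len(c)
    return any(
        all(len(dom & set(M)) >= 2 for dom in domains)
        for M in combinations(range(n), k_metros)
    )

# Hypothetical sites on a line, forming two distant groups; cost = distance.
positions = [0, 1, 2, 10, 11, 12]
c = [[abs(p - q) for q in positions] for p in positions]

print(kcn_satisfiable(c, k_metros=2, kk=2))  # domains too narrow to share two metros
print(kcn_satisfiable(c, k_metros=2, kk=6))  # full domains: same as the original problem
```

With two metro nodes, every site must have both of them in its domain, which the two distant groups cannot satisfy until the domains are wide enough.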

The advantage of the CBS approach is that if an original instance is satisfiable then a
modified instance obtained by CBS is also satisfiable. Another advantage is that it does
not enforce any lower bound restriction on the domain size of the metro node positions
for any exchange site. An upper bound restriction is implicitly imposed by the parameter
nbruns, which is equal to the number of times Algorithm 1 is invoked for computing
overlapping clusters. The application of cluster-based sampling for discarding a set of
metro node positions for each exchange site before the search starts can be an overhead.
However, it pays off since the time required for search reduces significantly without
sacrificing the quality of the solution, as shown in Tables 1-4. For harder instances it
requires almost two orders-of-magnitude less time than that of the MIP approach. Also,
the gap between the cost of the optimal solution and the cost of the best solution found
using CBS is within 0.05% of the optimal value, which is extremely low.
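The runtime claim can be checked with simple arithmetic on the reported times. The snippet below uses the three largest instances for 23 metro nodes, with the MIP and CBS times (in seconds) copied from Table 3; the variable names are illustrative.

```python
# (|E|, MIP seconds, CBS seconds) for 23 metro nodes, copied from Table 3.
rows = [
    (900, 3858.39, 30.84),
    (1000, 7537.79, 39.18),
    (1100, 9706.40, 36.61),
]

speedups = {n: mip / cbs for n, mip, cbs in rows}
for n, ratio in speedups.items():
    # A ratio above 100 corresponds to two orders of magnitude.
    print(n, round(ratio, 1))
```

For these instances the ratio of MIP time to CBS time exceeds 100, while for the small instances in the tables the two approaches take comparable time.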

6 Conclusions and Future Work 

We have studied and solved the double coverage problem arising in long-reach passive
optical networks that are robust to single node failures. We showed that the double
coverage problem is NP-Complete. In order to minimize the total length of optical fibre
that connects metro nodes and exchange sites we modeled the problem using mixed integer
linear programming. We proposed and studied a hybrid approach that performs
cluster-based sampling as a preprocessing step in order to reduce the possibilities of
metro node positions for exchange sites. We showed that the hybrid approach can reduce
the time required to solve the double coverage problem by up to two orders-of-magnitude,
especially when the size of the problem instance is large. Our study also shows that the
best solutions obtained by using the hybrid approach CBS are almost optimal.

The work most closely related to our contribution in this paper is the work on
dual-homing protection using MIP [9] and local search [3]. Although we have compared
against a MIP approach, a comparison with a local search approach is left as future work.
In future we would also like to extend our approaches so that they allow us to specify
the reach of the metro nodes. Consequently, this may make some problem instances
inconsistent. Therefore it would also be interesting to extend the problem definition so
that only a given percentage of total customers are required to be dually covered.